-
Free, publicly-accessible full text available October 6, 2026
-
In safety-critical robotic systems, perception is tasked with representing the environment to effectively guide decision-making and plays a crucial role in ensuring that the overall system meets its requirements. To quantitatively assess the impact of object detection and classification errors on system-level performance, we present a rigorous formalism for a model of detection error, and probabilistically reason about the satisfaction of regular-safety temporal logic requirements at the system level. We also show how standard evaluation metrics for object detection, such as confusion matrices, can be represented as models of detection error, which enables the computation of probabilistic satisfaction of system-level specifications. However, traditional confusion matrices treat all detections equally, without considering their relevance to the system-level task. To address this limitation, we propose novel evaluation metrics for object detection that are informed by both the system-level task and the downstream control logic, enabling a more context-appropriate evaluation of detection models. We identify logic-based formulas relevant to the downstream control and system-level specifications and use these formulas to define a logic-based evaluation metric for object detection and classification. These logic-based metrics result in less conservative assessments of system-level performance. Finally, we demonstrate our approach on a car-pedestrian example with a leaderboard PointPillars model evaluated on the nuScenes dataset, and validate probabilistic system-level guarantees in simulation.
Free, publicly-accessible full text available October 15, 2026
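The core step of turning a confusion matrix into a model of detection error can be illustrated with a minimal sketch. The class names and counts below are hypothetical, and the row-normalization shown is only the standard conditional-probability reading of a confusion matrix, not the paper's full formalism:

```python
# Hypothetical confusion matrix for a 3-class detector
# (rows: true class, cols: predicted class); counts are illustrative.
classes = ["pedestrian", "car", "background"]
counts = [
    [80, 5, 15],   # true pedestrian
    [3, 90, 7],    # true car
    [10, 6, 84],   # true background
]

# Row-normalize to obtain a probabilistic detection-error model
# P(predicted class | true class); such a model can parameterize a
# probabilistic abstraction over which system-level satisfaction
# probabilities are computed.
error_model = [[c / sum(row) for c in row] for row in counts]

# Example query: probability that a true pedestrian is misperceived.
p_ped_missed = 1.0 - error_model[0][0]
print(f"P(misperceive pedestrian) = {p_ped_missed:.2f}")
```

Each row of `error_model` sums to one, so each true class induces a probability distribution over perceived classes that can drive a model checker.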
-
Modern cyber-physical systems-of-systems (CPSoS) comprise complex collections of systems that must seamlessly work together to control safety- or mission-critical functions. Linear Temporal Logic (LTL) and Mission-time Linear Temporal Logic (MLTL) intuitively express CPSoS requirements for automated system verification and validation. However, both LTL and MLTL presume that all signals populating the variables in a formula are sampled at the same rate and of the same type (e.g., time or distance), and agree on a standard “time” step. Formal verification of cyber-physical systems-of-systems requires validatable requirements expressed over (sub-)system signals of different types, such as signals sampled at different timescales, distances, or levels of abstraction, within the same formula. Previous works developed more expressive logics that account for types (e.g., timescales) by sacrificing the intuitive simplicity of LTL. However, a legible, direct one-to-one correspondence between a verbal and a formal specification eases validation, reduces bugs, increases productivity, and linearizes the workflow from a project’s conception to its actualization. Validation includes both transparency for human interpretation and tractability for automated reasoning, as CPSoS often run on resource-limited embedded systems. To address these challenges, we introduced Mission-time Linear Temporal Logic Multi-type (MLTLM; Hariharan et al., Numerical Software Verification Workshop, 2022), a logic building on MLTL. MLTLM enables writing formal requirements over finite input signals (e.g., sensor signals and local computations) of different types, while maintaining the same simplicity as LTL and MLTL. Furthermore, MLTLM maintains a direct correspondence between a verbal requirement and its formal specification.
Additionally, reasoning about a formal specification in its intended type (e.g., hourly for an hourly rate, per second for a per-second rate) uses significantly less memory on resource-constrained hardware. This article extends the previous work with (1) many illustrated examples of types (e.g., time and space) expressed in the same specification, (2) proofs omitted for space in the workshop version, (3) proofs of the succinctness of MLTLM compared to MLTL, and (4) a minimal translation to MLTL of optimal length.
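To make the bounded temporal operators concrete, here is a minimal finite-trace evaluator for the two core MLTL operators, globally G[lb,ub] and finally F[lb,ub]. The trace, atom names, and the convention that steps past the end of the trace are ignored are all illustrative assumptions, not the paper's formal semantics:

```python
def holds_g(trace, lb, ub, phi):
    # G[lb,ub] phi: phi must hold at every step of [lb,ub] that lies
    # inside the finite trace (steps past the end are treated as
    # vacuous -- one common finite-trace convention, assumed here).
    return all(phi(trace, i) for i in range(lb, min(ub, len(trace) - 1) + 1))

def holds_f(trace, lb, ub, phi):
    # F[lb,ub] phi: phi must hold at some step of [lb,ub] inside the trace.
    return any(phi(trace, i) for i in range(lb, min(ub, len(trace) - 1) + 1))

def atom(name):
    # An atomic proposition reads one boolean signal sample.
    return lambda trace, i: trace[i][name]

# Hypothetical 3-step trace of two boolean signals.
trace = [{"obstacle": True, "brake": False},
         {"obstacle": True, "brake": False},
         {"obstacle": False, "brake": True}]
print(holds_f(trace, 0, 3, atom("brake")))     # True: brake holds at step 2
print(holds_g(trace, 0, 2, atom("obstacle")))  # False: obstacle fails at step 2
```

MLTLM's contribution is precisely that the index `i` above need not be a single global "time" step; different subformulas may range over signals sampled in different types (time, distance, abstraction level).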
-
Autonomous cyber-physical systems must be able to operate safely in a wide range of complex environments. To ensure safety without limiting mitigation options, these systems require detection of safety violations by mitigation-trigger deadlines. Because these systems operate in complex environments, multimodal prediction is often required. For example, an autonomous vehicle (AV) operates in complex traffic scenes in which any given vehicle may exhibit several plausible future behavior modes (e.g., stop, merge, turn); therefore, to ensure collision avoidance, an AV must be able to predict the possible multimodal behaviors of nearby vehicles. In previous work, model predictive runtime verification (MPRV) successfully detected future violations by a given deadline, but MPRV considers only a single mode of prediction (i.e., unimodal prediction). We design multimodal model predictive runtime verification (MMPRV) to extend MPRV to multiple modes of prediction, and we introduce Predictive Mission-time Linear Temporal Logic (PMLTL) as an extension of MLTL to support the evaluation of probabilistic multimodal predictions. We examine the correctness and real-time feasibility of MMPRV through two AV case studies in which MMPRV utilizes (1) a physics-based multimodal predictor on the F1Tenth autonomous racing vehicle and (2) current state-of-the-art deep neural network multimodal predictors trained and evaluated on the Argoverse motion forecasting dataset. We found that meeting real-time requirements was a challenge for the latter, especially when targeting an embedded computing platform.
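The combination of multimodal prediction with runtime verification can be sketched as follows: verify each predicted mode's trace separately, then weight the verdicts by mode probability. The mode set, trace encoding, and safety check below are hypothetical, and this is only the weighting idea, not the MMPRV algorithm itself:

```python
def violation_probability(modes, check_safe):
    """Probability mass on predicted modes whose trace violates the spec.

    modes: list of (probability, predicted_trace) multimodal predictions.
    check_safe: verifier returning True if a trace satisfies the spec.
    A minimal sketch: each mode is verified independently, and the
    verdicts are combined by mode probability.
    """
    return sum(p for p, trace in modes if not check_safe(trace))

# Hypothetical behavior modes for a nearby vehicle, with probabilities.
modes = [
    (0.6, ["lane_keep", "lane_keep", "lane_keep"]),  # safe
    (0.3, ["merge", "merge", "clear"]),              # safe
    (0.1, ["stop_in_lane", "stop_in_lane"]),         # unsafe
]
safe = lambda trace: "stop_in_lane" not in trace
p_viol = violation_probability(modes, safe)
print(p_viol)  # 0.1
```

A mitigation trigger can then compare `p_viol` against a threshold before the deadline, which is the kind of probabilistic verdict a PMLTL formula is designed to express.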
-
This paper studies the evaluation of learning-based object detection models in conjunction with model-checking of formal specifications defined on an abstract model of an autonomous system and its environment. In particular, we define two metrics – proposition-labeled and class-labeled confusion matrices – for evaluating object detection, and we incorporate these metrics to compute the satisfaction probability of system-level safety requirements. While confusion matrices have been effective for comparative evaluation of classification and object detection models, our framework fills two key gaps. First, we relate the performance of object detection to formal requirements defined over downstream high-level planning tasks. In particular, we provide empirical results showing that the choice of a good object detection algorithm, with respect to formal requirements on the overall system, depends significantly on the downstream planning and control design. Second, unlike the traditional confusion matrix, our metrics account for variations in performance with respect to the distance between the ego and the object being detected. We demonstrate this framework on a car-pedestrian example by computing the satisfaction probabilities for safety requirements formalized in Linear Temporal Logic (LTL).
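The second gap, distance-dependent performance, amounts to keeping one confusion matrix per distance bin rather than a single aggregate. A minimal sketch, with hypothetical labels and a hypothetical 10 m bin width:

```python
from collections import defaultdict

def build_binned_cm(samples, bin_width=10.0):
    """One confusion-matrix count table per ego-object distance bin.

    samples: iterable of (true_label, predicted_label, distance) tuples.
    Illustrates the idea that detection performance -- and hence the
    abstraction used for model checking -- varies with distance.
    """
    cms = defaultdict(lambda: defaultdict(int))
    for true, pred, dist in samples:
        b = int(dist // bin_width)       # index of the distance bin
        cms[b][(true, pred)] += 1
    return cms

# Hypothetical evaluation samples: pedestrians detected near, missed far.
samples = [("ped", "ped", 4.0), ("ped", "ped", 12.0), ("ped", "none", 35.0)]
cms = build_binned_cm(samples)
print(cms[3][("ped", "none")])  # 1: the long-range miss lands in bin 3
```

The same binning applies whether the labels are raw classes (class-labeled) or planner-relevant propositions such as "pedestrian in crosswalk" (proposition-labeled).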
-
We study the problem of analyzing the effects of inconsistencies in perception, intent prediction, and decision making among interacting agents. When accounting for these effects, planning is akin to synthesizing policies in uncertain and potentially partially-observable environments. We consider the case where each agent, in an effort to avoid a difficult planning problem, does not consider the inconsistencies with other agents when computing its policy. In particular, each agent assumes that other agents compute their policies in the same way as it does, i.e., with the same objective and based on the same system model. While finding policies on the composed system model, which accounts for the agent interactions, scales exponentially, we efficiently provide quantifiable performance metrics in the form of deltas in the probability of satisfying a given specification. We showcase our approach using two realistic autonomous vehicle case-studies and implement it in an autonomous vehicle simulator.
-
Object classification is a key element that enables effective decision-making in many autonomous systems. A more sophisticated system may also utilize the probability distribution over the classes instead of basing its decision only on the most likely class. This paper introduces new performance metrics for evaluating the accuracy of such probabilities: the absolute class error (ACE), the expectation of absolute class error (EACE), and the variance of absolute class error (VACE). We test these metrics using different neural network architectures and datasets. Furthermore, we present a new task-based neural network for object classification and compare its performance with a typical probabilistic classification model to show the improvement under threshold-based probabilistic decision-making.
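One plausible reading of these metrics, assumed here for illustration only (the paper's exact definitions may differ): per-sample ACE measures how far the predicted probability of the true class falls from 1, and EACE and VACE are the mean and variance of that error over a dataset:

```python
import statistics

def ace(probs, true_idx):
    # Absolute class error for one sample, under the assumed reading:
    # distance of the predicted probability of the true class from 1.
    return abs(1.0 - probs[true_idx])

# Hypothetical predicted distributions paired with true class indices.
preds = [([0.7, 0.2, 0.1], 0), ([0.4, 0.5, 0.1], 1), ([0.1, 0.1, 0.8], 2)]
errors = [ace(p, t) for p, t in preds]
eace = statistics.mean(errors)      # expectation of absolute class error
vace = statistics.variance(errors)  # variance of absolute class error
print(round(eace, 3))
```

Unlike top-1 accuracy, which is identical for all three samples above, these quantities distinguish a confident correct prediction (0.8) from a barely-correct one (0.5).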